FLIIMP - a community software for the processing, calibration, and reporting of liquid water isotope measurements on cavity-ring down spectrometers

Precise and accurate measurements of the stable isotope composition of precipitation, land ice, runoff, and oceans provide critical information on Earth's water cycle. The analysis, post-processing, and calibration of raw analytical signals from laser spectrometers involve a number of critical procedures to counteract instrumental drift and inter-sample memory effects, and to quantify total uncertainty. We present a new software tool for post-processing and calibration named FLIIMP (FARLAB Liquid Water Isotope Measurement Processor). FLIIMP facilitates sample processing by (1) a graphical user interface that guides the user along the processing steps from corrections for memory effects, drift, and mixing ratio to calibration, and (2) tools to monitor long-term measurement system behaviour, currently for Picarro-brand water isotope analysers. Final data files are accompanied by a detailed calibration report. As FLIIMP is open-source software for the major operating systems, users can adapt it to their laboratory environment, and the community can contribute to the software development.
• FLIIMP facilitates post-processing, calibration and reporting for stable water isotope liquid sample analysis.
• The stepwise, interactive graphical user interface reduces the possibility of errors and shortens processing time.
• Open source software enables future development of FLIIMP by the user community.


Installation and software structure
FLIIMP consists of a set of functions written in MATLAB, available from a public git repository. The software can be run either with a MATLAB installation in version 2017 or later (license required), or as compiled software available from the git repository using a free MATLAB runtime environment installation. FLIIMP is installed by cloning the repository to a local computer, or by downloading the routines as a zip file. FLIIMP is then started by running the main routine FLIIMP.m within the source directory. More detailed installation instructions, including the steps to open a sample data file provided with the software, and for installing the compiled application, are available on the git repository's wiki pages.
To facilitate understanding of the processing steps, as well as user configuration and potential modifications of the software, we briefly describe the general structure of the program code. The GUI is contained in the set of routines starting with FLIIMP_form_*.m, for example FLIIMP_form_preproc.m, and in FLIIMP.m (Fig. 1, top left box). More specific functionality is contained in separate routines, such as for file input (FLIIMP_read_liquid_injections.m), memory correction (FLIIMP_memory_analysis.m), mixing ratio correction (FLIIMP_wconc_corr.m), and calibration, reporting and data file output (FLIIMP_liquid_calibration.m). Configuration for a new laboratory can to a large extent be done without modification of MATLAB code, by changing parameters in the comma separated values (csv) formatted text file FLIIMP_config.csv, located in the installation folder (Appendix B). The entire program code of FLIIMP is extensively documented to facilitate modifications by others.
During processing, FLIIMP reads and writes several types of files (Fig. 1, round boxes). Regarding input files, the current version of FLIIMP is only able to read csv-formatted data files from Picarro-brand L2130-i, L2140-i, and many older analysers (Picarro Inc., Sunnyvale, USA). In the future, we may add functionality to read output files from other analyser brands. FLIIMP automatically detects data files that contain measurements of ¹⁷O, and in that case displays corresponding visualisation and processing options (Sec. 3.7). Importantly, FLIIMP does not require a specific run setup in terms of standard and sample vial arrangement and number of injections for processing. This flexibility facilitates the testing and optimisation of different run setups for different applications, which we consider an improvement over existing solutions. While the examples in this manuscript are based on the FARLAB run setup (Appendix A.1), we emphasise that users can operate FLIIMP seamlessly with entirely different run setups.
All processing steps in FLIIMP, including for example the re-naming of mis-labelled samples, are performed on top of the original data files. There is thus no need to maintain a database application or to make copies of analyser output files. Original water isotope analyser output files can for example be archived in read-only mode at a central network attached storage, while FLIIMP run settings only contain the operation steps to be performed on these data files, and are archived with the processing reports and calibrated data at a different storage location.
FLIIMP can either find input data files automatically in a default file structure when instrument and measurement start time are specified, or users can manually select input files. To be located automatically, data files are expected to reside in a common directory structure that defaults to <base path>/Instruments/<Instrument name>/IsotopeData/. The naming of input files should then comply with a format that defaults to <device>_IsoWater_<date>_<time>.csv (Sec. 3.1). The path structure and naming format of run input files can be modified in configuration file FLIIMP_settings.csv (Appendix B).
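The default layout described above can be illustrated with a short sketch. It is written in Python rather than in FLIIMP's MATLAB code, and the function name, the base path `/data`, and the instrument name `HIDS2254` are hypothetical placeholders chosen only for illustration; how FLIIMP itself assembles the search path is not shown here.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def build_search_pattern(base_path, instrument, device, date_prefix):
    """Return a wildcard pattern for run files in the default directory layout.

    date_prefix may be a full YYYYMMDD_HHMMSS stamp or only its leading part.
    """
    folder = PurePosixPath(base_path) / "Instruments" / instrument / "IsotopeData"
    return str(folder / f"{device}_IsoWater_{date_prefix}*.csv")


# Hypothetical base path and instrument name, for illustration only.
pattern = build_search_pattern("/data", "HIDS2254", "HIDS2254", "20220501_01")

candidates = [
    "HIDS2254_IsoWater_20220501_010203.csv",
    "HIDS2254_IsoWater_20220501_120000.csv",
    "HIDS2254_IsoWater_20220502_010203.csv",
]
file_pattern = PurePosixPath(pattern).name
matches = [name for name in candidates if fnmatchcase(name, file_pattern)]
print(pattern)
print(matches)
```

Only the first candidate starts with the requested date and hour, so it is the only file selected by this pattern.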
Output data files are created in csv format (Sec. 3.6), and accompanied by a detailed illustrated report in hypertext markup language (HTML) for sharing with users, or with additional details for laboratory-internal use ( Fig. 1 , bottom right). The uniform HTML reporting in FLIIMP allows the operator to inform customers in detail about the processing of their samples, including calibrated values, uncertainty, the processing methods, and recommended acknowledgements. Text blocks to be included in the report can be specified in the configuration file FLIIMP_config.csv. Some of the FLIIMP settings, including the last choice of analyser, input and output path, and window locations, are saved in the user's home directory in file FLIIMP_settings.mat for re-use in the next FLIIMP session.
FLIIMP can be operated either with mouse and keyboard through a GUI (Sec. 3.1), or from the command line in so-called batch mode (Sec. 4.3). Both processing modes give the same result, except that the batch mode does not allow for interactive manipulation of parameters, and requires a MATLAB installation. Commonly, users will mainly operate FLIIMP in interactive mode with the GUI. It is recommended to save the processing setup for traceability and later re-use for each run, for example for re-processing data files from older runs in batch mode if updated correction methods become available.

Data processing
Data file import (Step 1)
FLIIMP in interactive mode is started by running the main routine FLIIMP.m in MATLAB, or from the icon of the compiled app. After start, FLIIMP presents a window with the first processing panel (Step 1, Fig. 2). Here, the user enters input and output paths, selects the instrument on which the run was performed, and either specifies the time range of the analysis or makes a manual file selection (Fig. 2). Based on the user entries, FLIIMP displays the current search path for instrument file input above the date entry fields. If a run consists of several input files, both the start and end date have to be specified (format YYYYMMDD_HHMMSS, or parts thereof). The entered date will be used as part of a pattern matching to find all relevant files (e.g., 20220501_01*). If a run is contained within exactly one output file, no end date needs to be specified. If users prefer, or if the file name does not match the expected pattern (Appendix B.2), for example when using manually edited files, it is also possible to directly select and add the files to be processed by using the option Select files. Users can test FLIIMP with an input data file provided as part of the git repository and described in the installation notes.

Pre-processing procedures (Step 2)
We now return to processing in interactive mode. At the start of Step 2, the software tries to load and combine all input files that match the instrument and time patterns specified in Step 1, and presents these in the window for sample screening (Fig. 3). FLIIMP allows the user to perform four pre-processing steps before calibration that can correct for or help identify typical analytical problems during a run: (i) sample and injection outlier screening, (ii) memory correction, (iii) correction of mixing-ratio dependent baseline effects, and (iv) assignment of quality flags.

Sample screening
Liquid injection measurements involve several mechanical steps that can cause variations in measurements [9]. A common consequence is that some (or all) injections of a sample fail, for example due to syringe wear and clogging. Some samples also need to be excluded from the final reporting, for example in case of duplicates (see Appendix A.1). Other situations that may require sample exclusion are partially completed runs, or pre-conditioning samples for memory correction (see Sec. 3.3).
After clicking on the Read files button, the data files are loaded into FLIIMP, and the software proceeds to Step 2. Any error messages, for example if no files could be found, are displayed in the FLIIMP log window. At any processing step, the current processing settings for a run can be saved or loaded in *.mat format using the Save or Load menu items from the File menu at the top of each window. It is recommended to save the run settings for every run or batch on the analytical system for traceability and to enable reprocessing at a later time. The structure and content of the settings files are detailed in Appendix B.4.
In FLIIMP, individual injections or entire samples can be excluded from further processing from the GUI. In the interface for sample pre-processing (Fig. 3), a table on the left side displays the sample names and IDs, along with the number of injections, and the standard deviation of mixing ratio. Orange colour denotes samples that should be inspected in detail due to one of several potential error sources, such as a large standard deviation in mixing ratio or isotope δ-value. Clicking on a sample name will display a corresponding error bar graph in the panels to the right. In addition to the mixing ratio and δD, the pull-down menus also allow plotting of δ¹⁸O, δ¹⁷O, d-excess, and ¹⁷O-excess in ‰, as well as the memory effect in % once it has been quantified (Sec. 3.3). Operators are advised to inspect each sample of a run.

Fig. 3. User interface for Step 2 (sample pre-processing) in FLIIMP. The green symbols in the panels on the right-hand side indicate the result of memory correction applied to the raw delta values (black and blue markers). Orange text in the left panel indicates samples that either have large inter-sample mixing ratio variations, or large standard deviations of the mixing ratio during one injection.

Entire samples are excluded from further processing by selecting a sample in the sample list, and activating the Exclude from calibration checkbox. If a sample should only be excluded from memory correction, the Exclude from memory calc checkbox can be used. Conversely, a sample can be included in the memory correction because of its large difference in δ-value from the preceding sample, while being excluded from calibration because it served as a duplicate during the calibration sequence.
Individual injections within a sample can be marked as outliers by clicking below or above the error bar symbols in one of the display panels, or by typing the corresponding injection number in the text field below the preview of the selected sample. Excluded injections will be shown in red colour in the panels. Outliers are not taken into account for the maximum and minimum vertical range of the plots, and are therefore not always displayed entirely. Operators are advised to only exclude samples when justified. We have found that on some analysers the data acquisition rate sporadically drops substantially below the usual rate of about 0.75 Hz of Picarro's L2140-i. The injections during these cases are distorted, and need to be excluded from further analysis. Pressing the button Exclude bad injections will identify injections with frequencies lower than ∼0.65 Hz, and add them to the list of outliers. Excluded samples and injections are listed in the detailed calibration report in report section 8 (Sec. 4).
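The screening for low acquisition rates can be sketched as follows. This Python fragment is an illustration and not the FLIIMP implementation; in particular, estimating the rate as the number of data points divided by the injection duration is an assumption, and the function name and example numbers are invented.

```python
def find_slow_injections(n_points, durations_s, min_rate_hz=0.65):
    """Return 1-based injection numbers whose mean data rate is below min_rate_hz.

    Assumption: the acquisition rate of an injection is approximated by the
    number of recorded data points divided by the injection duration in seconds.
    """
    slow = []
    for number, (count, duration) in enumerate(zip(n_points, durations_s), start=1):
        rate_hz = count / duration
        if rate_hz < min_rate_hz:
            slow.append(number)
    return slow


# Invented example: four injections of 120 s each; the third has too few points.
n_points = [90, 91, 60, 89]
durations_s = [120.0, 120.0, 120.0, 120.0]
slow = find_slow_injections(n_points, durations_s)
print(slow)
```

In this example, only the third injection (0.5 Hz) falls below the 0.65 Hz threshold and would be added to the list of outliers.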

Mixing ratio-isotope ratio dependency correction
Humidity variations between injections are a common measurement artifact of liquid water analyses. According to specifications, Picarro analysers perform optimally for a mixing ratio between 17'000 and 23'000 ppm [9, 17]. In this range, spectroscopic baseline effects due to variable humidity are nearly constant. If the mixing ratio falls outside a range of about 15'000 to 25'000 ppm, injections either need to be discarded, or one can apply a correction method. It has long been documented that the baseline effect of these analysers is instrument dependent, but can be characterised by careful laboratory analysis, and thus be corrected for (e.g., [1, 3, 18]). Weng et al. [19] showed that the artifact is both a function of the mixing ratio and the isotope ratio, and that it differs depending on the matrix gas of the analyser. If the dependency functions for mixing ratio and isotope ratio are known, for example from the procedures presented in Weng et al. [19], FLIIMP can automatically correct individual injections with lower or higher humidity. Corrected mixing ratio and isotope ratios are shown in magenta colour in the mixing ratio plot (Fig. 3, centre top panel). The mixing ratio-isotope ratio correction assumes that the mixing ratio was constant during the measurement for a specific injection. In some cases, such as vaporiser septum leakage, the mixing ratio signal can show much larger slopes than the typical slope of about 200 ppm min⁻¹ when sampling from the vaporiser, causing a larger standard deviation for that injection. Such injections cannot be corrected properly, and should be excluded manually based on their inflated error bars in the mixing ratio display. If the correction functions are not yet determined for the used analysers, samples with low or high mixing ratio can be excluded as outliers (see Sec. 3.2.1).
Even though the correction functions for vapour analysis are non-linear, they can be approximated by linear relations within the range most common for liquid injection analysis (typically about 10'000 to 30'000 ppm). In the routine FLIIMP_wconc_corr.m, the correction functions f(x) are specified for each individual analyser in a laboratory, and for each isotope species, as a linear function of the raw mixing ratio x with slope a and intercept b. Thus, the corrected δ_corr results from an offset of the raw δ_raw by the corresponding δ-value correction f(x). Alternatively, a wider range of corrections can be achieved with hyperbolic fitting functions. Hereby, a, b, and c are fitting coefficients that depend on isotope ratio, and the mixing ratio is expressed relative to a reference value x_ref = 20'000 ppmv. Note that, as mentioned above, these correction functions depend on the matrix gas in addition to the delta value, and are specific to each analyser [19]. Several other correction functions have been used in the literature previously (e.g., [9, 14]), which could be implemented as alternative options in routine FLIIMP_wconc_corr.m.
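As an illustration of the linear case, the following Python sketch applies a correction with slope a and intercept b to a single injection. It is not taken from FLIIMP_wconc_corr.m: the sign convention (the correction is subtracted from the raw value) is an assumption, and the coefficients are invented and chosen such that the correction vanishes at 20'000 ppmv.

```python
import math


def correct_mixing_ratio(delta_raw, x_ppmv, a, b):
    """Offset a raw delta value by the linear correction f(x) = a * x + b.

    Assumption: the correction is subtracted from the raw value.
    """
    return delta_raw - (a * x_ppmv + b)


# Invented coefficients (permil per ppmv, permil); f(20000) is close to zero.
a, b = 1.0e-4, -2.0

corrected_low = correct_mixing_ratio(-80.0, 15000.0, a, b)
corrected_ref = correct_mixing_ratio(-80.0, 20000.0, a, b)
print(round(corrected_low, 3), round(corrected_ref, 3))
```

With these invented coefficients, an injection at 15'000 ppmv is shifted by 0.5 ‰, while an injection at the reference mixing ratio remains unchanged to within rounding.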

Quality flags and quality control
To enable a correct interpretation of the calibrated measurement results, all available information about analytical data quality, such as measurement artifacts and the applied corrections, needs to be documented clearly. Therefore, quality flags are assigned to each sample that signify information and warnings about the run, and about the applied post-processing steps. Operators of FLIIMP should check for combinations of several flags, which often indicate problems during the analytical procedure, such as a worn-out syringe. All flags are written out as a single-value bit combination in the calibrated data file (Table 1), and detailed further in the data report. For example, a sample where the mean of the standard deviations from each individual injection exceeds 200 ppmv would receive flag value 1, which can indicate problems with the peak shape. A sample where the mean of the standard deviations of the mixing ratio of all considered injections is larger than 500 ppmv would receive a flag value 2, potentially indicating problems with the syringe or septum. A sample where in addition the standard deviation of instrument temperature exceeded 0.15 K would receive the quality flag value 2 + 16 = 18. Measurements with no issues at all have a flag value of 0.
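The bit combination can be illustrated with a minimal Python sketch. Only the three flags named in the text are included; the remaining entries of Table 1 are not reproduced, and the constant and function names are hypothetical rather than taken from the FLIIMP code.

```python
# Flag bits taken from the examples in the text; names are illustrative.
FLAG_INJECTION_SD = 1   # mean per-injection mixing ratio SD above 200 ppmv
FLAG_SAMPLE_SD = 2      # mixing ratio SD over considered injections above 500 ppmv
FLAG_TEMPERATURE = 16   # instrument temperature SD above 0.15 K


def quality_flag(injection_sd_ppmv, sample_sd_ppmv, temperature_sd_k):
    """Combine the individual warnings into a single integer flag value."""
    flag = 0
    if injection_sd_ppmv > 200.0:
        flag |= FLAG_INJECTION_SD
    if sample_sd_ppmv > 500.0:
        flag |= FLAG_SAMPLE_SD
    if temperature_sd_k > 0.15:
        flag |= FLAG_TEMPERATURE
    return flag


def decode_flag(flag):
    """Return the individual bits that are set in a combined flag value."""
    return [bit for bit in (FLAG_INJECTION_SD, FLAG_SAMPLE_SD, FLAG_TEMPERATURE)
            if flag & bit]


flag = quality_flag(150.0, 620.0, 0.2)
print(flag, decode_flag(flag))
```

Because each warning occupies its own bit, the combined value 18 can be decomposed unambiguously into the flags 2 and 16.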

Memory correction (Step 3)
Step 3 in the FLIIMP processing sequence consists of the user interface to correct for inter-sample memory effects ( Fig. 4 ). FLIIMP memory correction uses a two-component model, and assists the user by automatically obtaining the optimal fitting parameters. This step is available from the button Memory correction in the pre-processing window ( Fig. 3 ). First, the underlying correction algorithm is described, before describing the corresponding GUI.

Memory correction method
Memory effects are an important measurement artifact during laser spectroscopy of liquid samples. They result from the carry-over of sample material on walls and other parts of the analyser between samples. Several methods have been proposed in the past to reduce memory effects between injections, and to correct for memory effects as part of post-processing [9]. FLIIMP has the option to correct all samples of a run based on memory correction coefficients that are typically obtained from a set of calibration or drift standards with a sufficiently large number of injections. Similar to Gröning [10], the memory correction in FLIIMP is based on a model that combines the effect of a slow and a fast memory component for each injection j (Eq. (3)). The two memory components contribute to memory as described by exponents a and b for the fast and slow processes, respectively, and decrease to 0 at N − n injections, where N is the total number of injections. The total memory M_j is then obtained by a weighted mean of both memory components, using the weight factor w, where w ∈ [0, 1], multiplied by the maximum, initial memory C_0 for injection j = 1. C_0 is obtained in a three-step process. First, the final value of the raw δ-value, ỹ_i, is calculated for each sample y_i, using the last n of N injections. Then, the difference between the final δ-values of consecutive samples i − 1 and i is used to compute the percent memory M_ij affecting each sample i at injection j. Note that negative values are not excluded during memory correction, to prevent a positive bias when fitting the memory correction. When sequential samples have a small difference in isotope ratio, the memory quantification becomes more uncertain.
Therefore, all available memory estimates are weighted by their relative contribution to the sum of the absolute value of the step between final δ-values from consecutive samples i − 1 and i:
$$\beta_i = \frac{\left|\tilde{y}_i - \tilde{y}_{i-1}\right|}{\sum_k \left|\tilde{y}_k - \tilde{y}_{k-1}\right|}$$
Weighting of each memory estimate for each injection by β_i gives more weight to memory estimates that have been obtained from a large inter-sample difference of δ-values. Based on the weighted individual memory estimates for each sample and injection, M_ij · β_i, the memory model (Eq. (3)) can then be fitted, using root mean square error (RMSE) minimisation. Averaging over a larger number of injections n lowers the sensitivity of the correction to other sources of variability, but requires sufficiently many injections.
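The estimation of the memory and of the weights can be sketched in Python as follows. This is a sketch under stated assumptions and not the code of FLIIMP_memory_analysis.m: the final value is taken as the arithmetic mean of the last n injections, the percent memory is assumed to be the remaining deviation from the final value relative to the step from the previous sample, and all names and numbers are invented.

```python
import math


def final_value(injections, n_last):
    """Mean of the last n_last raw delta values of a sample (assumed estimator)."""
    tail = injections[-n_last:]
    return sum(tail) / len(tail)


def memory_estimates(samples, n_last):
    """Return percent memory per injection and step weights for samples 2..end.

    samples: list of lists with raw delta values in measurement order.
    Assumes that consecutive samples differ in their final values.
    """
    finals = [final_value(sample, n_last) for sample in samples]
    steps = [finals[i - 1] - finals[i] for i in range(1, len(samples))]
    total_step = sum(abs(step) for step in steps)
    memory, weights = [], []
    for i, step in zip(range(1, len(samples)), steps):
        # Percent of the previous sample still present at each injection.
        memory.append([100.0 * (y - finals[i]) / step for y in samples[i]])
        # Relative contribution of this step to the sum of absolute steps.
        weights.append(abs(step) / total_step)
    return memory, weights


# Invented raw delta values for three consecutive samples with four injections.
samples = [
    [-90.0, -90.0, -90.0, -90.0],
    [-14.0, -10.5, -10.0, -10.0],
    [-48.0, -49.5, -50.0, -50.0],
]
memory, weights = memory_estimates(samples, n_last=2)
print(memory)
print(weights)
```

In this invented example, both sample transitions show 5 % memory at the first injection, but the estimate from the larger step (80 ‰) receives twice the weight of the estimate from the smaller step (40 ‰).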
After fitting, the computed memory correction M_ij for each sample i at injection j ∈ [2 … N] is forced to zero at the last n values using a linear deprecation. FLIIMP then applies the computed memory correction M_ij, resulting from the deprecated superposition of the two exponential correction functions, to all samples. In the program code, all memory processing is done in routine FLIIMP_memory_analysis.m.

Applying memory correction from the GUI
Memory correction is started in the GUI by pressing the button Memory correction in the pre-processing panel ( Fig. 4 ). One can either choose to select the samples for the memory correction based on the number of injections (e.g., 12 for standards, compared to 6 for samples), or from the sample name identifying a standard. Several standards can be specified when separated by blank characters in the field Standards. FLIIMP displays black cross markers for the memory from all individual injections that match the selection criteria. In addition, and for the fitting, error bar markers show the weighted average of the memory estimates for each injection.
Pressing the Autofit button will run through a combination of parameters a, b, and w, to find the best fit from minimising the RMSE. When autofitting is done, the lines for slow memory (blue), fast memory (green), and total (red) will be updated in the display, as well as the numbers in the interface. Next to the display, the initial memory and RMSE value are displayed. If needed, it is possible to adjust the results of the autofitting manually. The display will update after pressing the enter key when editing fitting parameters. The procedure needs to be done for each isotope species (δD, δ¹⁸O, and δ¹⁷O, if applicable). Importantly, users need to check the box Use correction for parameter if they wish to apply the memory correction for each isotope species. Once a memory correction has been established, and the interface window is closed, the memory corrected values are shown in green colour in the panels on the right in the pre-processing window (Fig. 3).
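The parameter search behind such an autofit can be sketched as a simple grid search. The Python code below is an illustration only: the functional form of the memory model (two exponentials in the injection number, weighted by w and scaled by C_0), the use of the first observed value as C_0, and the parameter grids are all assumptions and are not taken from the FLIIMP code.

```python
import itertools
import math


def memory_model(j, c0, a, b, w):
    """Assumed two-component model: fast (a) and slow (b) exponential decay."""
    k = j - 1
    return c0 * (w * math.exp(-a * k) + (1.0 - w) * math.exp(-b * k))


def rmse(observed, c0, a, b, w):
    """Root mean square error between the model and observed memory values."""
    residuals = [memory_model(j, c0, a, b, w) - m
                 for j, m in enumerate(observed, start=1)]
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


def autofit(observed, a_grid, b_grid, w_grid):
    """Try all parameter combinations and return the one with the lowest RMSE."""
    c0 = observed[0]  # assumption: initial memory equals the value at j = 1
    best = min(itertools.product(a_grid, b_grid, w_grid),
               key=lambda params: rmse(observed, c0, *params))
    return best, rmse(observed, c0, *best)


# Synthetic memory curve (percent) generated from known parameters.
observed = [memory_model(j, 5.0, 1.5, 0.3, 0.7) for j in range(1, 13)]
best, error = autofit(observed,
                      a_grid=[1.0, 1.5, 2.0],
                      b_grid=[0.1, 0.3, 0.5],
                      w_grid=[0.5, 0.6, 0.7, 0.8, 0.9])
print(best, error)
```

The grids for the fast and slow exponents are kept disjoint here because exchanging a with b and w with 1 − w yields the same curve, which would otherwise make the best fit ambiguous.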
In some situations, for example when measuring samples within a narrow range of isotope ratios, the memory correction is either inefficient, or can even degrade the calibrated results. Therefore, memory correction can be deactivated for each isotope species in the memory correction panel, or altogether in the run settings panel (Sec. 3.5.2). To decide whether to activate memory correction or not, it is important to review the quality of the correction for each sample after closing the memory correction panel. Operators are advised to check for all standards and samples that the memory correction has reduced the memory effect sufficiently, as evidenced by the green markers being nearly flat, and their 1-standard deviation ( Fig. 3 , green dashed lines) near the long-term reproducibility ( Fig. 3 , brown error bar).
In the FLIIMP report, a figure similar to Fig. 4 is displayed for all species in the detailed calibration report section 5. In addition, the memory correction is documented in the form of the initial memory C_0, as well as the exponents a, b for all species. Both quantities are also included in the parameter data file for performance monitoring (see Sec. 4). Initial memory for δD and δ¹⁸O can be used to monitor system performance over longer times. Increased memory may for example indicate salt buildup, or other contaminations, and help to identify when cleaning of the analytical system is needed [14]. The trend of the initial memory and several other run parameters can be inspected for a selected instrument and time period using the menu item Tools, Parameter analysis (not shown), implemented in routine FLIIMP_form_memdrift.m.
Note that memory correction only works between two sequential samples with a sufficiently large difference in δ-values. The threshold value for memory correction can be specified in the FLIIMP settings variable memory_limits, and is by default 12.0 ‰ for δD and 1.5 ‰ for δ¹⁸O and δ¹⁷O. Experience with the measurement system indicates, however, that memory effects also carry over between more than two sequential samples, albeit to a lesser extent. Such memory effects can currently not be corrected by FLIIMP post-processing. If large steps in isotope ratio are expected between samples, duplicating samples directly after one another is an efficient measure to minimise such a memory effect. The duplication of sample material is also used in the run setup at FARLAB (see Appendix A.1). In such cases, the duplicate samples are often excluded from calibration, but may be used for memory correction.

Changing sample description and sample ID (Step 4)
Step 4 in the FLIIMP processing sequence is a user interface to change sample names and sample identifiers (IDs). The distinction between sample names and sample IDs makes it possible to trace each sample with a unique identifier during analytical procedures, while keeping the customer's sample names intact. However, operators sometimes make mistakes when entering sample names or identifiers, such as swapping the position of the two. To avoid manually changing the input csv files, FLIIMP allows the user to easily re-assign and swap sample names and sample IDs. The name changes are applied as a (saved) operation during the sample processing, keeping the original data files without modification.
To edit individual sample names and identifiers, users click on a line in the table on the left, and edit directly in the edit fields in the top right (Fig. 5). If all or several entries need to be changed, the lower right controls allow the user to swap all sample names and sample IDs, or to assign new sample IDs consisting of a base string followed by a numbered sequence starting at the value provided in Start index. Samples with names provided in the field Standards (space separated) are thereby skipped over. Both original names and original sample IDs can be restored at any time during the renaming step. Modified table entries are highlighted in orange colour.
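The assignment of new sample IDs from a base string and a start index can be illustrated with the following Python sketch. It is not the FLIIMP implementation; the zero-padded number format, the function name, the base string `FL-`, and the standard name `STD1` are assumptions made for the example.

```python
def assign_sample_ids(rows, base, start_index, standards):
    """Assign sequential IDs to samples, leaving standards unchanged.

    rows: list of (sample_name, sample_id) tuples in run order.
    The three-digit zero padding is an assumption for illustration.
    """
    renamed = []
    counter = start_index
    for name, old_id in rows:
        if name in standards:
            renamed.append((name, old_id))  # standards are skipped over
        else:
            renamed.append((name, f"{base}{counter:03d}"))
            counter += 1
    return renamed


rows = [("STD1", "std"), ("Lake A", "x1"), ("Lake B", "x2"),
        ("WICO2", "std"), ("Lake C", "x3")]
renamed = assign_sample_ids(rows, "FL-", 7, {"STD1", "WICO2"})
print(renamed)
```

The input list is left untouched and a new list is returned, which mirrors the principle that renaming is stored as an operation rather than written back to the original data files.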

Sample calibration/normalisation (Step 5)
Step 5 in the FLIIMP processing sequence is the specification of the information needed for producing the report, and for performing the calibration (or normalisation) to VSMOW-SLAP scale. First, the instrument drift during the run can be quantified and assessed. Furthermore, inspection of control standards makes it possible to judge the performance of the measurement system during a given run. Finally, the processing report is exported, containing all information for internal monitoring or for an end user.
All calibration parameters are specified in the GUI during Step 5 (Fig. 6). These parameters include the calibration set, the calibration standards (with names separated by space), and the drift and control standards. It is also possible to activate or deactivate different pre-processing steps (humidity dependency correction, drift correction, and memory correction) during this step. In particular without memory correction, it may be desirable to restrict the processing to the last n injections of a sample. Specifying a value of −1 will use all available injections in the processing of each sample. Operators will also need to provide at least a project name and a run identifier. After completion of the specification, FLIIMP is ready for calibration of the samples, which is started by a click on the Calibrate button. FLIIMP will then create the output files in a new folder, including the calibration report (Sec. 4).

Drift correction
Drift correction is done before the actual calibration step. First, the slope of the drift is determined from a linear regression of the raw delta values of each of the drift standards against time. Then, the drift is shifted vertically, such that zero correction is applied at the middle of the run, since it is equally likely that the instrument drift is low at the beginning as at the end of the run, thus weighting the correction equally [10] . Instrumental drift is in particular relevant for runs that extend over several days. The drift correction is contingent on a suitable run setup. In an analytical system that can drift, the time used for measuring standards and for measuring samples needs to be optimised, to be able to detect and correct for drift, while measuring sufficiently many samples. A major complication thereby is the separation of instrument drift from memory effects, as well as other sources of variability. As described in the run setup ( Appendix A.1 ), FARLAB uses duplicate samples of the drift standards to reduce the impact of memory effects on the drift correction. Therefore, one needs to exclude the first sample of the drift standard duplicates from calibration during sample pre-processing (Sec. 3.2.1).
With FLIIMP, laboratories can choose other suitable run setups to correct for drift. The drift correction is calculated in FLIIMP from the deviation of all available repeated standard measurements (drift, calibration, control) from their respective initial uncalibrated value. A linear fit to these offset values provides the mean drift of the measurement system in units of ‰ day⁻¹ (Fig. 7a,b). The interpolated linear drift is then subtracted from all samples before calibration. The residuals from the linear drift estimate quantify the deviation from assumed linearity (Fig. 7c,d). From the GUI for Step 5, the drift correction result figure can be previewed using the button Drift preview (Fig. 6). If desired, drift correction can be deactivated in the GUI during Step 5 using the checkbox Drift correction.
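The drift estimate can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the FLIIMP code: an ordinary least-squares fit is used, the correction is centred on the middle of the run, and the standard name `DRIFT` and all values are invented.

```python
import math


def drift_offsets(standards):
    """Collect times and offsets of each standard from its first measurement.

    standards: dict mapping a standard name to a list of (time_days, raw_delta).
    """
    times, offsets = [], []
    for series in standards.values():
        _, first_value = series[0]
        for time_days, value in series:
            times.append(time_days)
            offsets.append(value - first_value)
    return times, offsets


def fit_slope(times, offsets):
    """Ordinary least-squares slope of offset against time (permil per day)."""
    n = len(times)
    t_mean = sum(times) / n
    o_mean = sum(offsets) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (o - o_mean) for t, o in zip(times, offsets))
    return sxy / sxx


def drift_correct(value, time_days, slope, run_start, run_end):
    """Subtract the linear drift, with zero correction at the middle of the run."""
    t_mid = 0.5 * (run_start + run_end)
    return value - slope * (time_days - t_mid)


# Invented repeated measurements of two standards over a two-day run.
standards = {
    "DRIFT": [(0.0, -80.0), (1.0, -79.9), (2.0, -79.8)],
    "WICO2": [(0.5, -10.0), (1.5, -9.9)],
}
times, offsets = drift_offsets(standards)
slope = fit_slope(times, offsets)
corrected = drift_correct(-50.0, 2.0, slope, run_start=0.0, run_end=2.0)
print(round(slope, 3), round(corrected, 3))
```

For this invented run, the fitted drift is about 0.1 ‰ per day, so a sample measured at the end of the run is shifted by about −0.1 ‰, and a sample measured at the start by about +0.1 ‰.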

Calibration to VSMOW-SLAP scale
Calibration (or normalisation) to VSMOW-SLAP scale is commonly done using secondary laboratory standards that have been calibrated against primary standards available from IAEA. The assigned values of secondary laboratory standards are provided as csv files in subfolder standards in the FLIIMP source code directory. As assigned values may change over time, or different waters may be added, new calibration sets can be added to that folder using a specific naming scheme (Appendix B.1). In addition to the assigned value, the combined uncertainty of the secondary (and primary) standards needs to be specified to allow for a correct calculation of the combined uncertainty of the calibrated samples. When reprocessing in batch mode, a different calibration set can be selected using the settings parameter calset (Appendix B.4).
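Calibration itself is done following recommended IAEA procedures [12]:
$$\delta D^{c}_{smp} = \delta D^{c}_{LS1} + \left(\delta D^{w}_{smp} - \delta D^{w}_{LS1}\right)\,\frac{\delta D^{c}_{LS2} - \delta D^{c}_{LS1}}{\delta D^{w}_{LS2} - \delta D^{w}_{LS1}} \qquad (9)$$
Hereby, δD^c_smp is the calibrated value of the sample, δD^c_LS1 and δD^c_LS2 denote the calibrated values of working standards 1 and 2, and superscript w denotes the raw values of the standard and sample. Corresponding equations exist for the other isotope species.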
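A two-point normalisation of this kind can be written compactly, as in the following Python sketch. It assumes the common linear form in which the raw sample value is mapped onto the line through the two working standards; the function name and all numerical values are invented for illustration and are not FLIIMP code.

```python
import math


def calibrate_two_point(raw_smp, raw_ls1, raw_ls2, cal_ls1, cal_ls2):
    """Map a raw delta value onto the scale defined by two working standards."""
    slope = (cal_ls2 - cal_ls1) / (raw_ls2 - raw_ls1)
    return cal_ls1 + (raw_smp - raw_ls1) * slope


# Invented raw and assigned values (permil) for two working standards.
raw_ls1, cal_ls1 = -10.0, -8.0
raw_ls2, cal_ls2 = -310.0, -314.0

calibrated = calibrate_two_point(-160.0, raw_ls1, raw_ls2, cal_ls1, cal_ls2)
print(round(calibrated, 3))
```

By construction, a sample with the raw value of one of the working standards is returned at exactly that standard's assigned value.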
The processing report contains a figure to control the quality of the calibration to VSMOW-SLAP scale. Samples are displayed at their position along the calibration line, which informs about the contribution from the calibration standard uncertainty to combined uncertainty. In addition, warnings are issued in the log window and report if samples are outside the standard range, and a corresponding warning flag is set (Sec. 3

Uncertainty calculation
Combined uncertainty of the calibrated samples is estimated from an error budget, involving the following components ([10, 14]):
1. u(h)²: variance from the assigned uncertainty of isotopically heavy standard h with respect to VSMOW-SLAP
2. u(l)²: variance from the assigned uncertainty of isotopically light standard l with respect to VSMOW-SLAP
3. u(H)²: variance from uncertainty expressed as standard error of the mean (SEM) of measured values of isotopically heavy standard H (SD for a single measurement)
4. u(L)²: variance from uncertainty expressed as SEM of measured values of isotopically light standard L (SD for a single measurement)
5. u(m)²: variance of sample (unknown). Approximated by repeated measurements or by long-term reproducibility and repeatability. If no value for the long-term reproducibility is specified (Table B.1), this value is estimated by the scaled SEM of each repeated sample measurement.
The combined uncertainty u(c) is then calculated as the square root of the sum of the squared error components in the budget using
$$u(c) = \sqrt{s_h^2\,u(h)^2 + s_l^2\,u(l)^2 + s_H^2\,u(H)^2 + s_L^2\,u(L)^2 + s_m^2\,u(m)^2}$$
where s_h, s_l, s_H, s_L and s_m are sensitivity terms of the form s_h = ∂f/∂δ_h, corresponding to each of the five elements of the error budget. Hereby, f represents the calibration function (Eq. (9)). Near the centre of the two-point calibration curve, samples obtain a lower uncertainty than near the edges [11]. Adding more standards to obtain a three-point calibration curve, for example, has been shown to be less important than obtaining precise and accurate measurements for a two-point calibration [16]. FLIIMP therefore currently only offers a two-point calibration. The FLIIMP data report explains the processing steps and the calculation of combined uncertainty for each batch of samples in report section 4. The detailed report also contains a table specifying the full uncertainty budget of each sample.
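The propagation of the five budget components through a two-point calibration can be sketched numerically. In the Python example below, the sensitivity terms are approximated by central finite differences instead of analytical derivatives; this choice, the function names, and all numbers are assumptions made for illustration and do not reproduce the FLIIMP code.

```python
import math


def two_point(h, l, H, L, m):
    """Two-point calibration: assigned values h, l; raw values H, L; raw sample m."""
    return l + (m - L) * (h - l) / (H - L)


def combined_uncertainty(values, uncertainties, step=1e-4):
    """Root sum of squares of sensitivity times uncertainty for each component.

    Sensitivities are estimated by central differences of the calibration
    function with respect to each of its five inputs.
    """
    total = 0.0
    for k, u in enumerate(uncertainties):
        upper = list(values)
        lower = list(values)
        upper[k] += step
        lower[k] -= step
        sensitivity = (two_point(*upper) - two_point(*lower)) / (2.0 * step)
        total += (sensitivity * u) ** 2
    return math.sqrt(total)


# Invented standards at 0 and -300 permil; all uncertainties set to 0.2 permil.
uncertainties = [0.2, 0.2, 0.2, 0.2, 0.2]
u_centre = combined_uncertainty([0.0, -300.0, 0.0, -300.0, -150.0], uncertainties)
u_edge = combined_uncertainty([0.0, -300.0, 0.0, -300.0, 0.0], uncertainties)
print(round(u_centre, 4), round(u_edge, 4))
```

With equal component uncertainties, the sample at the centre of the calibration range obtains a smaller combined uncertainty than the sample at the edge, in line with the behaviour described above.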
An important consequence of the uncertainty budget is that calibration standards spanning a wider range will improve the uncertainty of calibrated samples located in between [16]. However, using calibration standards with very different isotope composition introduces more memory, and requires either more injections or other methods to obtain an accurate calibration. In the currently used setup for runs at FARLAB, for example, all standard vials are duplicated or triplicated (Appendix A.1). The first vial of a standard allows memory to be quantified, but is ignored during the calibration. Only the second and third vial of the same standard are then used during calibration. If the δ-values of the calibration standards differ strongly, the final value of the second vial can still differ by up to 1 ‰ in δD after 12 injections due to inter-vial memory. In the future, it may be possible to attempt correcting for inter-vial memory effects in FLIIMP. We emphasise here again that users may choose different run setups to reduce the combined uncertainty, which FLIIMP will be able to handle seamlessly.

Control standards
The overall analytical quality of a run can be assessed from the evaluation of a set of control standards. Control standards are, for example, included in the measurement sequence of the standards at the beginning of a run. The calibrated drift standards can also serve to assess the quality of a run. In the calibration report, the calibrated values of the control standards are displayed in comparison to their assigned value. Fig. 8 shows an example where WICO2 was used as a control standard. At FARLAB, runs are further inspected and potentially rejected if the majority of the control and drift standards deviate from their assigned value by more than ± 2 times the long-term reproducibility of the measurement system (Fig. 8, dashed and dotted green lines).
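The inspection criterion described above can be written as a short Python sketch. The function name, the standard names, and the numbers are hypothetical; only the rule itself (a majority of control and drift standards deviating by more than 2 times the long-term reproducibility) is taken from the text.

```python
def needs_inspection(calibrated, assigned, ltr, factor=2.0):
    """Return True if the majority of standards deviate by more than factor * LTR.

    calibrated, assigned: dicts mapping a standard name to its value (permil).
    ltr: long-term reproducibility of the measurement system (permil).
    """
    outside = sum(
        1 for name, value in calibrated.items()
        if abs(value - assigned[name]) > factor * ltr
    )
    return outside > len(calibrated) / 2

# Hypothetical control and drift standards, values in permil.
assigned = {"control_1": -12.00, "drift_1": -7.50, "drift_2": -7.50}
calibrated = {"control_1": -12.05, "drift_1": -7.48, "drift_2": -7.90}

flag = needs_inspection(calibrated, assigned, ltr=0.08)
```

Here only one of the three standards lies outside the ± 2 LTR band, so the run would not be flagged under this rule.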

Output files and data report
After a click on the Calibrate! button in step 5, FLIIMP creates all output files and the calibration report in a new folder within the previously specified output folder (Sec. 3, Fig. 2 ). Two versions of the calibration report can be created, a user report, and a detailed report ( Table 2 ). The user report will contain only the information that is needed by regular end users, including a table with the calibrated results themselves, a description of the processing steps, a recommendation for acknowledgements, and the calibrated data file with the results for each individual sample. The detailed report contains additional sections that inform about the performance of the measurement system, and the corrections for each sample. For most end users such additional information is either unnecessary, would require extensive description to be useful, or it could be considered non-public information of the laboratory. Therefore, external end users generally receive the simpler user report, whereas the detailed report is archived at the laboratory for traceability and as documentation in case of reprocessing.
The name of each output folder is created from a combination of the project name, the current run identifier, the processing date, and the report type in the format <project_name>_<run_ID>_<processing_date>_<report_type>, for example 2022-02-HS_run01_20220509_internal. The project name and run identifier should not contain spaces or file system characters, such as forward and backward slashes, and colons. Within the output folder, the report is created as an HTML document named index.html, to be viewed with any regular browser (Table 3). The HTML report contains links to the result files in csv format, located in the output folder (Table 3). In addition to the calibrated data file and the calibrated data summary, the detailed report contains additional output files for monitoring which samples have been run at what time and position (accounting file), for long-term monitoring of the performance of the measurement system (parameters file, standards file), and for assessing the impact of the corrections and calibration on each measurement (alldata file) (Table 3).
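As an illustration of the naming convention, the following Python sketch assembles a folder name and rejects project names or run identifiers that contain spaces, slashes, or colons. The function name is hypothetical, and the date format YYYYMMDD is inferred from the example given above.

```python
import re
from datetime import date

# Characters that should not appear in the project name or run identifier.
_FORBIDDEN = re.compile(r"[\s/\\:]")

def output_folder_name(project_name, run_id, processing_date, report_type):
    """Build <project_name>_<run_ID>_<processing_date>_<report_type>."""
    for label, value in (("project name", project_name), ("run identifier", run_id)):
        if _FORBIDDEN.search(value):
            raise ValueError(f"{label} contains a space or file system character: {value!r}")
    return f"{project_name}_{run_id}_{processing_date:%Y%m%d}_{report_type}"

folder = output_folder_name("2022-02-HS", "run01", date(2022, 5, 9), "internal")
```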

Processing of data files with 17 O measurements
FLIIMP has been designed throughout as a triple-isotope enabled processing tool. The processing of measurements that include the parameter δ¹⁷O in addition to δD and δ¹⁸O requires additional user attention [15, 17]. In FLIIMP, additional display and processing options become available if δ¹⁷O values are detected in the input files. In the user interface, δ¹⁷O and ¹⁷O-excess are available as additional display menu options during the sample screening (Step 2). Memory effects for δ¹⁷O can be corrected during the regular memory assessment (Step 3). Finally, in the calibration settings (Step 5), the user can select whether to include δ¹⁷O and ¹⁷O-excess in the output using the Parameter selector. If δ¹⁷O output is selected, additional columns are included in the output files for δ¹⁷O, ¹⁷O-excess, and their uncertainty. In addition, both δ¹⁸O and δ¹⁷O are then written with 4-digit rather than 2-digit precision, following the recommendations of Schoenemann et al. [15]. In order for the ¹⁷O analysis to work correctly, the assigned values of δ¹⁷O for the calibration, drift and control standards need to be included in the csv files defining the calibration standards. While FLIIMP at its present stage enables processing of ¹⁷O files, users are advised to be aware of the additional requirements of this type of analysis, such as sensitivity to drift and large uncertainties from the calculation of the ¹⁷O-excess, and to take corresponding measures during sample analysis as recommended in the literature.
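The need for higher output precision can be illustrated with a short Python sketch. It assumes the commonly used logarithmic definition of the ¹⁷O-excess with a reference slope of 0.528, expressed in per meg. The text above does not state which definition FLIIMP implements, so both the formula and the sample values are assumptions for illustration.

```python
import math

LAMBDA_REF = 0.528  # assumed reference slope of the commonly used definition

def o17_excess_per_meg(d17O, d18O):
    """17O-excess in per meg from delta values given in permil."""
    return 1e6 * (math.log(d17O / 1000.0 + 1.0)
                  - LAMBDA_REF * math.log(d18O / 1000.0 + 1.0))

# Hypothetical sample, delta values in permil.
d18O, d17O = -10.1234, -5.3456

full = o17_excess_per_meg(d17O, d18O)
rounded = o17_excess_per_meg(round(d17O, 2), round(d18O, 2))
rounding_effect = rounded - full  # per meg, caused only by 2-digit rounding
```

A change of 0.01 ‰ in δ¹⁷O alone corresponds to roughly 10 per meg, which is of the same order as typical ¹⁷O-excess signals. Rounding the δ-values to two digits can therefore noticeably distort the derived ¹⁷O-excess.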

Quality assurance
Due to the consistent treatment of runs, and the output of control parameters of the measurement system, FLIIMP gives direct access to information about the long-term performance of the measurement procedures and specific analysers. In addition, during the development of FLIIMP, care has been taken to test and verify all functionality of the software. A set of test routines produces artificial data sets with defined error properties for drift, memory, or humidity variation. These data sets can then be used to check the correct implementation of new or modified correction routines, and to compare FLIIMP with other calibration methods. These essential parts of the quality assurance are detailed and discussed below.

Long-term reproducibility
The long-term reproducibility (LTR) is an important metric for quality control at a laboratory. Repeated measurements of a control standard over time allow laboratories to quantify their long-term reproducibility. FLIIMP facilitates access to this information through an additional tool that scans the archived output data files from previous FLIIMP calibrations for a specified time period. The interface to inspect and compute the long-term reproducibility is available from the menu item Tools, Long-term reproducibility (Fig. 2). After selecting the menu item, the LTR tool window appears, in which one can specify the time period and instrument to be analysed (Fig. 9). The assessment results can be copied for further processing into text and spreadsheet applications. At FARLAB, for example, long-term reproducibility has been quantified from a long-term average of calibrated control/drift standard measurements. During 2016-2020, drift standard DI showed a long-term reproducibility of 0.491 ‰ for δD, and 0.076 ‰ for δ¹⁸O, estimated from the 1-standard deviation of all accepted runs. Runs with large offsets in the control and drift standards (> 3 times LTR difference from assigned value), which are archived but have been discarded and repeated, have been excluded from this assessment. Other drift standards result in similar estimates, albeit for shorter time periods.

Fig. 9. User interface for the assessment of long-term reproducibility within FLIIMP. Shown is an example for 7 months of measurements of drift standard DI2 (within-run 1-standard deviation given as error bars) on a Picarro L2140-i analyser with serial number HKDS2038. Here, filtering of runs with a deviation from the assigned value larger than 3 times the long-term reproducibility has been activated.

[Table fragment: Realistic use case; all of the above and preconditioning sample; Test 1 to 5 combined, plus a preconditioning sample]
In 2020, drift standard DI2 replaced DI, with a long-term reproducibility of 0.446 ‰ for δD, and 0.052 ‰ for δ¹⁸O after filtering for runs with a large standard deviation (Fig. 9). To include the LTR values in the combined uncertainty computation, they need to be specified in the configuration file as parameter longTermReproducibility (Table B.1).
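The LTR estimate described in this section can be sketched in Python as follows. The function name and the numbers are hypothetical, and the use of the sample standard deviation is an assumption. The sketch is not the LTR tool of FLIIMP. It only illustrates the principle of excluding runs that deviate by more than 3 times a previous LTR from the assigned value and then taking the 1-standard deviation of the accepted runs.

```python
import statistics

def long_term_reproducibility(run_values, assigned, ltr_previous=None, factor=3.0):
    """Return (LTR estimate, number of accepted runs).

    run_values: calibrated values of one control or drift standard, one per run.
    ltr_previous: if given, runs deviating by more than factor * ltr_previous
                  from the assigned value are excluded before the estimate.
    """
    if ltr_previous is None:
        accepted = list(run_values)
    else:
        accepted = [v for v in run_values
                    if abs(v - assigned) <= factor * ltr_previous]
    return statistics.stdev(accepted), len(accepted)

# Hypothetical calibrated values of a drift standard (permil), assigned value 0.0.
runs = [0.0, 0.1, -0.1, 5.0]
ltr, n_accepted = long_term_reproducibility(runs, assigned=0.0, ltr_previous=0.2)
```

The resulting value would then be entered as parameter longTermReproducibility in the configuration file.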

Comparison to SICalib
We have compared artificially created input data files processed by both FLIIMP and SICalib (see Sec. 4.4). In these files, the drift and the memory factors are exactly known, since they have been used in the creation of the artificial data files. Comparisons based on artificial data files with imposed drift show that the memory correction results in differences of the final results that can exceed analytical uncertainty, confirming that the distinction between drift and memory is a main factor for processing. We have also compared the calibrated results between FLIIMP and SICalib for actual runs performed on two different Picarro analysers at the water isotope laboratory at AWI, Germany. When the same actual run files are processed, the root-mean-square error amounts to 0.0053 to 0.0227 ‰ for δ¹⁸O, and 0.032 to 0.136 ‰ for δD, with a mean error (bias) of −0.0007 to 0.000 ‰ for δ¹⁸O and 0.016 to 0.027 ‰ for δD, substantially below typical analytical uncertainties. Since differences in all other processing results appear virtually negligible, we conclude that one can expect equally valid results from using either of the liquid sample processing tools.

Processing in batch mode
It may sometimes be necessary to re-process runs that have already been processed interactively with FLIIMP. For example, reprocessing may be recommended after recalibration of internal laboratory standards. Other situations that can induce the need to re-process runs with FLIIMP are software changes, such as improved processing or correction methods. The capability to re-process runs in batch mode has also been very useful during the development of FLIIMP to obtain consistent data file formats, which in turn allow the long-term characteristics of the measurement system and the long-term reproducibility to be assessed (Sec. 4.1). Batch mode requires a full MATLAB installation with license.
In order to run FLIIMP in batch mode, one or more settings files (*.mat) saved from interactive mode are needed. The settings files are first loaded into a MATLAB variable in a new, separate MATLAB script. Then, the settings variable can be modified as needed, before calling the batch-mode processing routine FLIIMP_batch.m from within the script, using the modified settings variable as an argument. An example of a batch mode script that reprocesses a run while changing from detailed to short reporting is provided in routine FARLAB_batch_run.m in the FLIIMP repository. A description of all elements of the settings structure that can be modified is given in Appendix B.4.

Test routines for optimised software development
All steps and properties of the FLIIMP software have been quality checked using regular and artificial data files. Artificial data files consist of a text file in the same csv format as obtained from regular analyser output during liquid injection measurements, but with specific characteristics. For example, a fixed linear drift in δ-value is imposed across the run. In total, 6 tests have been designed to include drift, memory effects, humidity variation, missing or bad injections, and a combination of several factors (Table 4). The input and output files of the tests are stored in subfolder tests in the FLIIMP source directory. 3 Test run settings for tests 1-6 can either be loaded in the user interface (e.g., test01_run_settings.mat), or users can select instrument HKDS2038 and dates 20200101 to 20200106 during processing Step 1, given that the input path points to folder test. Test run files also contain δ¹⁷O.
Software tests are common practice in software development to identify software errors or other unwanted side effects. The artificial input data files are used to evaluate the correct performance of FLIIMP after, for example, including new or improved functionality. The tests are run using routine batch_FLIIMP_tests.m, which batch processes each test run, then calculates the RMSE of the calibrated values in comparison to assigned values. Tests are passed if the differences in calibrated results are smaller than a specified threshold. Tests of single factors for the current FLIIMP version show close agreement between assigned and calibrated values (Fig. 10, Tests 1, 2, 4, 5). Larger deviations are noticeable in the test involving memory correction, and in the combination of all factors (Tests 3, 6). This larger difference can be mainly ascribed to the previously mentioned difficulty of separating instrument drift and memory effects during post-processing. New test input files can be created using the routine create_FLIIMP_tests.m.

Final remarks
FLIIMP is a flexible, new analytical tool for processing liquid sample measurements for stable water isotopes from commercially available laser spectrometers that runs on different operating systems. FLIIMP provides calibrated results that can be considered equal to other existing solutions, while providing several innovations that reduce the time effort during processing, make operations less error-prone, and support the traceability of results by consistent documentation.
The availability of a GUI allows for user guidance and interactive adjustment of parameters, for example during sample screening and memory correction, and for the quality assessment of individual runs and of the analytical system. Runs may be reprocessed in batch mode if needed, to obtain consistent long-term monitoring of a measurement system, including long-term reproducibility. Standardised reports in HTML format contain the information for either customers or laboratory operators. Data files are made available in csv format for simple use in either spreadsheet applications or data analysis software. FLIIMP thereby contributes to overall more standardised processing in laboratory workflows, and supports principles of good laboratory practice, such as traceability.
While FLIIMP can be run without a MATLAB license using the MathWorks runtime environment, an important current limitation of FLIIMP is that further development requires a costly software license. While a transition to the license-free Python language may be achieved in the longer term, the stand-alone compiled versions of FLIIMP are available as a first-order remedy. Another important limitation is that FLIIMP currently is only able to read input files from Picarro-brand analysers. With FLIIMP being available in a publicly accessible repository, the community may actively contribute with improved processing methods to the future development of FLIIMP.
Processing of measurement files and artificial input files shows that correcting for memory across injections and across samples, and the separation between drift and memory effects, remain a major challenge for liquid water analysis with the instruments used here. Flexible post-processing tools, such as FLIIMP, are a valuable asset in finding run setups and other developments that further improve correction methods and analytical procedures, thereby supporting reliable scientific interpretations of stable water isotope measurements from different laboratories.

Ethics statements
Not relevant for this work.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

• Specify calibration and drift standards, with standard names separated by spaces
• Specify averaging of injections for first and all other samples, use −1 to include all
• Preview calibration and drift using the corresponding buttons
• Select the report type for internal and/or external use. External customers usually receive an abbreviated version of the report that does not disclose lab internal instrument data.
• Save settings any time underway, but at the latest now
• Press calibrate and watch for report being generated
• Processing Step 6: Check results
  • Open report file index.html in a browser and inspect the results, in particular the memory correction, drift correction, calibration, and uncertainty.
  • Return to FLIIMP to make adjustments if needed

Appendix B. Configuration of FLIIMP
While FLIIMP can be tested without modifications using the provided sample input file, laboratories eventually will have to adapt FLIIMP to their specific environment. All configuration settings are collected in csv file FLIIMP_config.csv located in the source directory (Table B.1). In order to adapt FLIIMP to a new laboratory, the settings within this file need to be adapted. Some settings require adjustment in all cases, such as the laboratory and instrument names. Other settings are recommended to be adjusted to include the correct method description and acknowledgement sections. Optionally, one can specify the correction functions of the mixing ratio-isotope ratio dependency. After these initial modifications, FLIIMP can be run interactively or in batch mode in a new laboratory. New standards can be added as described in the next section. More complex changes involve modifications in other parts of the program code and are described further below.
Laboratories will need to specify their laboratory standards during FLIIMP configuration. Available standards are specified as csv files in the subfolder standards. A new calibration set is most easily created by copying an existing calibration set to a new name, specifying a number at the start of the file name followed by a dash (e.g., 004-lab-standards-2023.csv). Each row in the csv file corresponds to a standard, and the columns contain the assigned values and uncertainties for the three isotope species (dD, dD_u, d18O, d18O_u, d17O, d17O_u for δD, δ¹⁸O, and δ¹⁷O). A description string identifies the calibration set used in the data report. During the build-up of the user interface, the calibration sets in the standards folder are automatically identified and made available from the respective pull-down menu (e.g., Fig. 7, Calibration standards).
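To make the layout of a calibration set more concrete, the following Python sketch reads a small csv text with the column names listed above. The column name "name", the standard names, and all values are assumptions for illustration. The actual files in the standards folder may contain further fields, such as the description string, and should be used as the template.

```python
import csv
import io
import re

# File names start with a number followed by a dash, e.g. 004-lab-standards-2023.csv
CALIBRATION_SET_NAME = re.compile(r"^\d+-.+\.csv$")

# Hypothetical calibration set; "name" is an assumed column, values are invented.
EXAMPLE_CSV = """name,dD,dD_u,d18O,d18O_u,d17O,d17O_u
STD_HEAVY,9.0,0.5,1.50,0.05,0.80,0.03
STD_LIGHT,-300.0,0.6,-38.00,0.05,-20.20,0.03
"""

def read_standards(text):
    """Return {standard name: {column: assigned value or uncertainty}}."""
    standards = {}
    for row in csv.DictReader(io.StringIO(text)):
        name = row.pop("name")
        standards[name] = {column: float(value) for column, value in row.items()}
    return standards

standards = read_standards(EXAMPLE_CSV)
valid_name = CALIBRATION_SET_NAME.match("004-lab-standards-2023.csv") is not None
```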
Appendix B.2. Input file format and pathname templates for file input
FLIIMP currently parses analyser input files in expectation of a specific text file format for Picarro-brand analysers, e.g., HKDS2039_IsoWater_20220101_102000.csv. The order of data columns can vary, and if not all fields are present in a file, a warning message is displayed, and FLIIMP attempts to construct missing information if possible. The file input takes place in routine FLIIMP_read_liquid_injections.m, and input functionality for other analyser brands may be added here.
At FARLAB, the liquid injection data files from all analysers are archived in a file structure with a base path contained in variable inputPath (e.g., /Volumes/farlab/). Within the base path location, FLIIMP expects a sub-folder Instruments. Therein, subfolders named after the instrument identifiers are expected (e.g., HKDS2039), each containing a sub-folder IsotopeData, which then finally contains all liquid injection data files. The complete search path to the data files is thus composed by joining several parameters in the form <inputPath>/Instruments/<instrument>/IsotopeData/. When adapting FLIIMP to a new lab, one can either choose to reproduce the file structure expected by FLIIMP, or modify the parameter inputPattern in FLIIMP_settings.csv as needed (Sec. 2).
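The search path and file name conventions can be illustrated with a short Python sketch. The function names are hypothetical, and the file name pattern (instrument identifier, the literal IsoWater, date as YYYYMMDD, time as HHMMSS) is inferred from the single example file name given above, so it should be treated as an assumption.

```python
import re
from datetime import datetime
from pathlib import PurePosixPath

# Pattern inferred from the example HKDS2039_IsoWater_20220101_102000.csv
FILE_PATTERN = re.compile(
    r"^(?P<instrument>[A-Za-z0-9]+)_IsoWater_(?P<date>\d{8})_(?P<time>\d{6})\.csv$"
)

def data_folder(input_path, instrument):
    """Compose <inputPath>/Instruments/<instrument>/IsotopeData."""
    return PurePosixPath(input_path) / "Instruments" / instrument / "IsotopeData"

def parse_file_name(file_name):
    """Return (instrument identifier, start time) encoded in the file name."""
    match = FILE_PATTERN.match(file_name)
    if match is None:
        raise ValueError(f"unexpected file name: {file_name!r}")
    started = datetime.strptime(match["date"] + match["time"], "%Y%m%d%H%M%S")
    return match["instrument"], started

folder = data_folder("/Volumes/farlab/", "HKDS2039")
instrument, started = parse_file_name("HKDS2039_IsoWater_20220101_102000.csv")
```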

Appendix B.3. Specifying mixing ratio -isotope ratio correction functions
Measurements at varying mixing ratios are affected by changes in baseline during spectroscopic analysis (see Sec. 3.2.2). In FLIIMP, corrections to the mixing ratio-isotope ratio dependency are implemented in file FLIIMP_wconc_corr.m. In the routine, a case statement selects from which analyser the data originate. Corrections are either specified as a linear relation between corrected and raw signal, or as a more complex correction function that also includes the δ-value (Sec. 3.2.2). The corrected δ-values are then assigned to a new variable ending in _corr. If no correction functions are available, the corrected and uncorrected values are identical, and a warning is displayed in the log. This behaviour can be deactivated by unchecking the humidity correction check box during Step 5, or permanently by uncommenting the statement use = 0 in routine FLIIMP_wconc_corr.m.

[Table fragment, settings parameters (name, type, description): (name not recovered), integer (0/1), use humidity dependency correction; useMemoryCorrection, integer (0/1), use memory correction; writeReport, integer (0/1/2), write long or short report, or both]